Abstract

In this paper, we address the feasibility of partitioning rule-based
systems into a number of meaningful units to enhance the
comprehensibility, maintainability and reliability of expert systems
software. Preliminary results have shown that no single structuring
principle or abstraction hierarchy is sufficient to understand complex
knowledge bases. We therefore propose the Multi-Viewpoint Clustering
Analysis (MVP-CA) methodology to provide multiple views of the same
expert system. We present the results of using this approach to
partition a deployed knowledge-based system that navigates the Space
Shuttle’s entry. We also discuss the impact of this approach on
verification and validation of knowledge-based systems.

Keywords domain knowledge, primary 
view, secondary view, conceptual clustering. 

Introduction 

Knowledge-based systems owe their appeal to the promise of utilizing
expertise in the domain knowledge for the solution of difficult,
poorly-understood, ill-structured problems. However, they must be
subjected to rigorous verification and validation (V&V) analyses
before they can be accepted into real-world critical applications.
Unfortunately, expert systems do not lend themselves to the
traditional V&V techniques for highly reliable software. There is a
need to formulate an acceptable set of V&V techniques which can assure
their quality. Better knowledge-acquisition techniques as well as
better management, understanding and enhancement of the knowledge base
are critical to the success of such V&V activities.

The difficulty in the V&V of large knowledge-based systems arises for
a number of reasons. First, rapid prototyping and iterative
development form key features of any expert system development
activity. This has led to the development of ad-hoc techniques for
expert system design without any software engineering guidelines.
Moreover, due to the data-driven nature of expert systems, as the
number of rules of an expert system increases, the number of possible
interactions between the rules increases exponentially. The complexity
of each pattern in a rule compounds the problem of V&V even further.
As a result, large expert systems tend to be incomprehensible,
difficult to debug or modify, and almost impossible to verify or
validate.

Compounding the problem further is the 
fact that most expert systems are built with- 
out much regard to defining the require- 
ments or specifications upfront. As any soft- 
ware, conventional or knowledge-based, be- 
comes more complex, common errors are 
bound to occur through misunderstanding of 
specifications and requirements. Therefore, 
it is our belief that even if a software life cy- 
cle stresses specifications and requirements 
upfront, that will not be enough to guarantee 
the right product for complicated systems. 
There are bound to be ambiguities and in- 
terpretational problems. What is needed is a 
complementary tool that is capable of expos- 
ing such ambiguities and misinterpretations 
so that corrective action can be taken be- 
fore it is too late in the software life cycle. 
Having a semi-automated means for captur- 
ing and structuring the meta-knowledge in a 
rulebase and cross-checking it with the spec- 
ifications and requirements at various stages 
of the software life cycle could certainly help 
in this effort. 

Conventional software yields more easily 
to verification efforts because control is ex- 
plicitly represented as procedures which can 
be structured to encapsulate run-time ab- 
stractions. Modules can be designed in con- 
ventional software, each consisting of a man- 
ageable unit with a well-defined interface. 
Furthermore, procedures can be grouped 
into packages or objects which share an 
internal data structure. These units can 
then be subjected to unit/integration testing
techniques.

Due to the declarative style of program- 
ming in knowledge-based systems, the gen- 
eration of clusters to capture significant con- 
cepts in the domain seems more feasible than 
it would be for procedural software. By
using knowledge-based programming techniques
one is much closer to the domain
knowledge of the problem than with pro- 
cedural languages. The control aspects of 
the problem are abstracted away into the in- 
ference engine (or alternatively, the control 
rules are explicitly declared). The existence 
of a model of the domain would benefit the 
analysis of other knowledge-based systems 
within that domain by providing seeds for 
cluster formation. In addition, the use of a 
domain model to assist in the development of 
new knowledge-based systems is a promising 
research direction. 

Existing research indicates that misunder- 
standings of the domain are a primary cause 
of systems failures [5, 12, 19]. Often small 
oversights or misunderstood interactions be- 
tween sources of expertise lead to catas- 
trophic failures. Techniques, methodologies 
and supporting tools are therefore needed 
to manage a complex system from multiple 
viewpoints and discover subtle interrelating 
concepts that are so critical for assuring the 
reliability of these systems. Even though 
language support for systems structuring has 
long been recognized as a key aspect of mod- 
ern software and knowledge engineering, it is 
our contention that no single structuring can 
simultaneously capture all the important con- 
cepts in complex knowledge-based systems. 
We believe that techniques, methodologies 
and supporting tools are needed to manage 
a complex system from multiple viewpoints 
and that the discovery of subtle interrelating 
concepts is critical for assuring the reliability 
of these systems. 

In this paper, we propose the concept of 
Multi-Viewpoint Clustering Analysis (MVP- 
CA) and show it as a feasible and effective 
technique towards structuring a rulebase for 
capturing its explicit as well as its implicit 
knowledge. The extraction of implicit, previously
unknown, yet potentially useful information
from the rulebase can have considerable
impact on various stages of the life
cycle of knowledge-based systems software. 
It can expose various design pitfalls during 
construction of the rulebase and the func- 
tional limitations of the software during its 
operation, as well as the subtle interrelation- 
ships between subgroups of rules that could 
prove very valuable in the maintenance of 
the system. It is our contention that the un- 
derstanding of any large knowledge base will 
require that it be viewed from several differ- 
ent, possibly orthogonal viewpoints. MVP- 
CA provides an ability to discover signifi- 
cant structures within the rulebase by pro- 
viding a mechanism to structure both hierar- 
chically (from detail to abstract) and orthog- 
onally (from different perspectives). More- 
over, transfer of expertise from one prob- 
lem domain to another related domain would 
be facilitated through the factoring of com- 
mon aspects across the domains. Hence soft- 
ware reuse can be exploited through multiple 
structuring of a knowledge-based system. 

First, we give an overview of our approach, 
followed by the methodology used to gener- 
ate meaningful partitions. Next, we present 
the results of applying this methodology to 
a deployed expert system for navigation. We 
discuss some of the related work in this area 
and finally give our conclusions. 

MVP-CA Overview 

Our research efforts address the feasibility of automating the
identification of rule-groups in knowledge-based systems software, to
reflect the underlying subdomains of the problem. We prove the
feasibility of the MVP-CA (Multi-Viewpoint Clustering Analysis)
methodology by building an MVP-CA tool to structure a few CLIPS² [3]
knowledge-based systems along several viewpoints and showing that no
single structuring principle or abstraction hierarchy is sufficient to
understand complex knowledge bases.

Our approach utilizes clustering analysis 
techniques to group rules which share signif- 
icant common properties and to identify the 
concepts which underlie these groups. Clus- 
ter analysis is a kind of unsupervised learn- 
ing in which (a potentially large volume of) 
information is grouped into a (usually much 
smaller) set of clusters. If a simple descrip- 
tion of the cluster is possible, then this de- 
scription emphasizes critical features com- 
mon to the cluster elements while suppress- 
ing irrelevant details. Thus, clustering has 
the potential to abstract from a large body 
of data, a set of underlying principles or con- 
cepts which organizes that data into mean- 
ingful classes. The knowledge acquisition 
process therefore involves “mining” the rule 
base for interesting concepts shared among 
the rules. The quality of clustering is related 
to two competing factors: intra-group cohe- 
siveness and inter-group coupling. Infor- 
mally, one can say that a group (or a cluster) 
is cohesive if all the items clustered together 
are somehow related or similar. Two groups 
are highly coupled if they share many sim- 
ilar properties and they are loosely coupled 
(possibly decoupled) if they share few (or no) 
similar properties. It is interesting to note 
that the qualities which define a good cluster 
are precisely those which define a good mod- 
ular functional decomposition of a problem. 

Preliminary experiments with the MVP-CA tool exposed significant
natural structures within different knowledge bases. For example,
consider ONAV (Onboard Navigation Expert System) [1], an expert system
deployed on the shuttle to navigate during re-entry. The file
structure of ONAV provides one partitioning of the whole system. Not
only did we find this generally accepted partitioning of ONAV, but we
also found less obvious, more subtle interrelationships that existed
across these primary clusterings. In this paper we present some of our
results of applying the MVP-CA tool to ONAV. Misunderstandings of
subtle interactions contribute most to the unreliability of
knowledge-based systems [10]. Hence any methodology that exposes these
relationships will contribute towards the V&V of large knowledge-based
systems.

² C Language Production System

To illustrate the need for multiple view- 
points, consider an expert system for select- 
ing the appropriate wine to complement a 
dinner. Even such a relatively small rule- 
base can be structured from several differ- 
ent viewpoints, as shown in Figure 1. Very 
broadly, the knowledge base can be divided 
into knowledge about the problem domain 
(selecting the appropriate wine) and knowl- 
edge about the control domain. The control 
knowledge breaks up further into user inter- 
face (how to question the user) and over- 
all control strategies (balancing user prefer- 
ences against experts’ opinion through var- 
ious phase control rules). Printout state- 
ments that ask the user for input or control 
the phasing of control rules belong to the 
control domain. 

Similarly, knowledge about the problem domain, to aid in the selection
of an appropriate wine for a meal, can be further subdivided into
three major subdomains: types of food, wine properties and varieties,
and a model of the customer’s preferences. These domains are further
subdivided into various subaspects. All these reflect different
viewpoints of the same rule base. Within the food subdomain there are
partitionings of taste of food, style of food, ingredients, etc. This
is a hierarchical partitioning under the food subdomain. An orthogonal
viewpoint in the wine subdomain is the interaction of wine properties
with meal qualities. Similarly there are different aspects of the
problem from the customer’s viewpoint. In addition, there are rules
which overlap subdomains or pass information to rules in other
subdomains (data dependency relationships). Thus the same rule can be
part of one subdomain and at the same time create information for use
by rules in other subdomains, such as interface rules that
specifically combine concepts from two subdomains (e.g., the
relationship between beverage and the style of food). There is an
added value in using the MVP-CA tool for exposing substructures within
the abstract groups formed, through hierarchical partitionings
generated by it. The hierarchies represent viewpoints at different
levels of conceptual abstraction.

MVP-CA Methodology 

The methodology used for MVP-CA is summarized graphically in Figure 2.
In the Cluster Generation Phase the focus is on generating meaningful
clusters through statistical and semantics-based measures. In the
Cluster Analysis Phase the focus is on performing a statistical and
functional analysis of the output generated from the previous phase.
Results of a statistical analysis of the output data feed back as
better constraints on the parameters for grouping to improve the
quality of subsequent clusterings. A functional analysis of the
clusters captures the key concepts conveyed by the clusters generated.
Concepts are meaningful patterns in the rulebase along with their
associated attributes. A set of key concepts constitutes a single
viewpoint. Multiple clusterings present multiple viewpoints on the
rule base.
[Figure 1: A Multi View Point of the Wine Rule Base (panels: Problem Domain, Control Domain)]

[Figure 2: Phase-1 Data Flow Diagram (Cluster Generation Phase)]

A two-step procedure is utilized for extracting multiple viewpoints of
a rulebase. First, form the best cluster possible using various
measures, such as dispersion, cohesion and coupling. The overall
dispersion of a pattern $p$ is

$$\mathrm{disp}(p) = \sum_{i=1}^{n_C} \mathrm{disp}_{G_i}(p)$$

where $n_C$ is the number of groups for clustering $C$ and
$\mathrm{disp}_{G_i}(p) = 1$ if $p \in G_i$ and is 0 otherwise.
Coupling is defined in terms of the inter-group distance, $D(i,j)$, as
follows:

$$D(i,j) = \sum_{r_k \in G_i} \sum_{r_l \in G_j} \frac{d(r_k, r_l)}{n_i * n_j}$$

where $n_i$ and $n_j$ are the number of rules in groups $G_i$ and
$G_j$, respectively, and $d(r_k, r_l)$ is the distance between rules
$r_k$ and $r_l$ defined according to a distance metric selected by
taking into account the nature of the rule base application [15]. For
a given clustering, $C$, the cohesiveness measure is an index of the
similarity of rules belonging to the same group. Cohesiveness of a
rule with respect to the group $G_i$ that it belongs to is the average
number of concepts (cncp) it shares with the other rule members in the
group $G_i$.

$$\mathrm{coh}_{G_i}(r_k) = \sum_{\substack{r_l \in G_i \\ r_k \neq r_l}} \frac{|2 * \mathrm{comm\_cncp}(r_k, r_l)|}{|\mathrm{cncp}(r_k)| + |\mathrm{cncp}(r_l)|}$$
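
The three measures can be made concrete with a small sketch. It assumes each rule is represented by a set of concept names, that a pattern belongs to a group when at least one rule of the group mentions it, and that cohesiveness is the plain sum shown in the formula; the rule names, the concept sets and `toy_distance` are hypothetical stand-ins, not the metric of [15].

```python
from itertools import product


def dispersion(pattern, groups, concepts):
    """disp(p): count the groups in which pattern p occurs at least once."""
    return sum(
        1 for group in groups
        if any(pattern in concepts[rule] for rule in group)
    )


def inter_group_distance(group_i, group_j, distance):
    """D(i, j): pairwise rule distances summed and divided by n_i * n_j."""
    total = sum(distance(rk, rl) for rk, rl in product(group_i, group_j))
    return total / (len(group_i) * len(group_j))


def rule_cohesiveness(rule, group, concepts):
    """coh_G(r_k): summed concept overlap of r_k with every other rule in G.

    Assumes every rule has at least one concept, so the denominator is nonzero.
    """
    total = 0.0
    for other in group:
        if other == rule:
            continue
        shared = len(concepts[rule] & concepts[other])
        total += 2 * shared / (len(concepts[rule]) + len(concepts[other]))
    return total


# Hypothetical toy data: rule name -> set of concepts it mentions.
concepts = {
    "r1": {"phase", "tacan"},
    "r2": {"phase", "tacan", "lru"},
    "r3": {"phase", "baro"},
}
groups = [["r1", "r2"], ["r3"]]


def toy_distance(a, b):
    """Placeholder metric: number of concepts not shared by the two rules."""
    return len(concepts[a] ^ concepts[b])


print(dispersion("phase", groups, concepts))
print(inter_group_distance(groups[0], groups[1], toy_distance))
print(rule_cohesiveness("r1", groups[0], concepts))
```

In this toy data the concept `phase` appears in every group, which is the kind of high-dispersion pattern the methodology singles out.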


Our clustering algorithm starts with all 
rules in their own clusters. At each step 
of the algorithm, the two groups which are 
most similar are merged together to form 
a new group. This pattern of mergings 
forms a hierarchical cluster from the single- 
member rule cluster to a cluster containing 
all the rules. One can look at this cluster- 
ing near the “best” clustering points. De- 
ciding which level in the hierarchy forms the 
“best” clustering of the rules requires an
analysis of the cohesiveness of each cluster
(the intragroup similarity) versus the cou- 
pling between groups (the intergroup sim- 
ilarity). When group cohesiveness is plot- 
ted against number of groups, plateau re- 
gions are generated signifying stable values 
for cohesiveness in certain ranges of number 
of groups. These regions represent optimal 
partitionings for a particular level of concep- 
tual abstraction. Insight into concepts dom- 
inating the various clusters can be obtained 
through an examination of the groups at se- 
lect points on the plateau regions. A hierar- 
chical view of the rulebase can then be gen- 
erated by repeating the above procedure for 
different plateau regions on the cohesiveness 
plots. 
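
A minimal sketch of the merge loop and of plateau detection follows. It assumes that "most similar" means smallest inter-group distance and that a plateau is a run of consecutive group counts whose cohesiveness changes by no more than a small tolerance; `agglomerate`, `plateaus`, the tolerance and the toy data are illustrative choices, not the MVP-CA tool's actual implementation.

```python
def agglomerate(rules, group_distance):
    """Merge the two closest groups repeatedly; return every level of the hierarchy."""
    groups = [[r] for r in rules]
    levels = [[list(g) for g in groups]]
    while len(groups) > 1:
        # Pick the pair of groups with the smallest inter-group distance.
        i, j = min(
            ((a, b) for a in range(len(groups)) for b in range(a + 1, len(groups))),
            key=lambda ab: group_distance(groups[ab[0]], groups[ab[1]]),
        )
        merged = groups[i] + groups[j]
        groups = [g for k, g in enumerate(groups) if k not in (i, j)] + [merged]
        levels.append([list(g) for g in groups])
    return levels


def plateaus(cohesion_by_count, tolerance=0.01, min_length=3):
    """Return (first, last) group counts of stretches where cohesion is nearly flat."""
    counts = sorted(cohesion_by_count)
    runs, start = [], 0
    for idx in range(1, len(counts) + 1):
        if idx < len(counts):
            step = abs(cohesion_by_count[counts[idx]] - cohesion_by_count[counts[idx - 1]])
            if step <= tolerance:
                continue  # still on the same flat stretch
        if idx - start >= min_length:
            runs.append((counts[start], counts[idx - 1]))
        start = idx
    return runs


# Hypothetical toy data: four "rules" placed on a line.
positions = {"a": 0.0, "b": 1.0, "c": 10.0, "d": 11.0}


def mean_gap(g1, g2):
    """Average pairwise gap between the members of two groups."""
    return sum(abs(positions[x] - positions[y]) for x in g1 for y in g2) / (len(g1) * len(g2))


levels = agglomerate(list(positions), mean_gap)
toy_cohesion = {1: 0.1, 2: 0.3, 3: 0.305, 4: 0.30, 5: 0.304, 6: 0.6}
print(levels[2])
print(plateaus(toy_cohesion))
```

Each entry of `levels` is one horizontal cut through the hierarchy, so examining the cuts whose group counts fall inside a reported plateau corresponds to inspecting one level of conceptual abstraction.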

Next, with this “best” cluster, form a con- 
cept focus list - to either sharpen a current 
viewpoint or expose an alternate viewpoint. 
The concept focus list is formed from dis- 
persion statistics of patterns. Dispersion is 
based on shared concepts - i.e. how a sin- 
gle concept is dispersed among the clusters. 
Low dispersion concepts are likely to repre- 
sent concepts which characterize the clusters 
they are in. In fact, high dispersion concepts 
may interfere with the generation of highly 
cohesive clusters. Removing these concepts 
before clustering can help define the clus- 
ters more distinctly - a process which we 
call “sharpening”. However, high dispersion 
concepts may also represent legitimate al- 
ternate structurings of the knowledge base. 
By selectively removing the low dispersion 
concepts, it is possible to reveal subtle alter- 
nate viewpoints - a concept we have termed 
multi-viewpoint clustering analysis [17, 16]. 
Thus the MVP-CA methodology provides
a mechanism for comprehending complex
knowledge-based systems through structuring
them both hierarchically (from detail to
abstract) and orthogonally (from different
perspectives) leading to discovery of significant
structures within the rule base.
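
The concept focus list can be sketched as a dispersion table followed by an excision step. The sketch assumes rules are sets of concept names and that the analyst decides which end of the ranking to excise; `dispersion_table`, `excise` and the rule and concept names are hypothetical.

```python
def dispersion_table(groups, concepts):
    """Map each concept to the number of groups in which it appears."""
    table = {}
    for group in groups:
        seen = set().union(*(concepts[rule] for rule in group))
        for concept in seen:
            table[concept] = table.get(concept, 0) + 1
    return table


def excise(concepts, focus_list):
    """Return a copy of the rule-to-concepts map with the listed concepts removed."""
    drop = set(focus_list)
    return {rule: cs - drop for rule, cs in concepts.items()}


# Hypothetical toy rulebase and a clustering of it.
concepts = {
    "tacan-predict": {"phase", "tacan"},
    "tacan-auto": {"phase", "tacan", "mode"},
    "baro-aif": {"phase", "baro", "aif"},
    "drag-aif": {"phase", "drag", "aif"},
}
groups = [["tacan-predict", "tacan-auto"], ["baro-aif"], ["drag-aif"]]

table = dispersion_table(groups, concepts)
# Rank concepts from most to least dispersed; ties are broken alphabetically.
ranked = sorted(table, key=lambda c: (-table[c], c))
focus_list = ranked[:1]
reduced = excise(concepts, focus_list)
print(ranked)
print(reduced["baro-aif"])
```

Re-running the clustering on `reduced` rather than `concepts` is what produces a sharpened or alternate view; here the two rules sharing `aif` are no longer dominated by the concept common to every group.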


Experimental Results 

In this section we present some of the re- 
sults obtained to date with the deployed 
knowledge-based system ONAV. Other re- 
sults using animal classification and wine se- 
lection (available as part of the CLIPS 5.1 
release) expert systems have been presented 
in [16]. 

Even with extensive comments and a tool such as CRSV³ [2], the
conceptual dependencies of rules across files cannot be easily
determined. Since we had no experience with Shuttle mission
terminology, the rulenames were our only guide for understanding the
domain in this knowledge base. After clustering this rulebase several
times using different criteria, we began to understand more
of the subtle interrelationships. A graphical 
user interface, currently under development, 
would allow us to navigate through the rule- 
base and document the insights generated 
by the partitioning, thus fully utilizing the 
MVP-CA methodology. We document be- 
low our understanding of ONAV based on 
the natural partitionings set up by the devel- 
oper as well as different groupings generated 
through the MVP-CA tool. We also show 
some of the interrelated concepts uncovered 
by this tool. 

ONAV is an expert system developed at NASA Johnson to help navigate
re-entry of a spacecraft. It has 387 rules divided across 16 files
reflecting the various stages of navigation: ascent, entry and
landing. The largest file, tacan.r, contains 127 rules. Monitoring of
the space shuttle through ONAV entails updating some state vectors in
the files state.r, Sstate.r and hstd.r. Measurements of velocity and
acceleration are calculated through sensor readings from various
devices such as the inertial measurement unit (imu), drag unit (drag),
barometer unit (baro), tactical air navigation unit (tacan) and
microwave scan beam landing system (msbls). The readings go through a
Kalman filter and the state vector is updated through different types
of line replacement units (lru) attached to the different devices. The
computers onboard perform the necessary integrations on the corrected
readings to obtain accurate values of velocity and position.

³ CLIPS Cross Reference Style Analysis and Verification Tool

During landing, readings from different sources have to be tallied so
that the positioning of the shuttle can be as accurate as possible
before it hits the runway. During ascent the shuttle relies mainly on
the inertial measurement unit readings, since an accurate positional
value is less critical. All the lrus feed data to both the primary
avionics system software (PASS) and the backup flight system (BFS).
Each of these systems has a different selection scheme for determining
the quality of data received. Ground-based radar stations resolve any
conflicting values for the position of the shuttle and are used to aid
in isolating malfunctioning equipment on board. Fidelity of the data
is monitored through the status of a number of different flags. Rules
in telemetry.r and operator.r determine which of the readings and
updated state vectors are reliable at any point in time and give the
operator power to override any decision. The file tables.r provides
general information on the lru configurations onboard, the fault
matrix to be used for identifying the imu component that has failed,
and a definition of the quality ratings to be used for the different
state vectors and data readings. Runway selections are checked out in
the file runway.r. Rules in init.r, control.r, and output.r
essentially accomplish the initial set up of global information during
the different stages of the navigation by activating the various phase
control rules, and they also handle the user interface issues.

Initial analysis of our results indicates that 
grouping a rulebase according to control as- 
pects of the problem is not sufficient for un- 
derstanding the problem. The static aspects 
of the problem can be understood only if 
domain knowledge can be separated from 
control knowledge [8, 9]. The original par- 
titioning of ONAV into 16 files by the de- 
veloper provided only a coarse partitioning 
based on the different phase aspects of the 
knowledge-based system. When the phase 
aspects of the rulebase were excised, it was 
found that rules with similar domain infor- 
mation were formed into a single group to 
give a secondary view. In order to discover 
the implicit interconnections between rules 
in different files, we combined all the files of 
ONAV to form one 387-rule rulebase. Since 
ONAV is primarily a monitoring system with
some diagnostic capabilities, more meaningful
partitionings were obtained when the antecedent
patterns played a major role in determining
the distance between rules [15].
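
One way to let antecedent patterns dominate the distance is to weight the two sides of a rule differently. The sketch below is purely illustrative: it is not the metric defined in [15], and the Jaccard form, the `antecedent_weight` parameter and the rule dictionaries are assumptions introduced here.

```python
def rule_distance(rule_a, rule_b, antecedent_weight=0.8):
    """Weighted Jaccard distance over antecedent and consequent pattern sets."""

    def jaccard_distance(x, y):
        union = x | y
        return 0.0 if not union else 1.0 - len(x & y) / len(union)

    lhs = jaccard_distance(rule_a["antecedents"], rule_b["antecedents"])
    rhs = jaccard_distance(rule_a["consequents"], rule_b["consequents"])
    return antecedent_weight * lhs + (1.0 - antecedent_weight) * rhs


# Hypothetical rules: same conditions, different actions.
monitor_a = {"antecedents": {"phase", "tacan"}, "consequents": {"printout"}}
monitor_b = {"antecedents": {"phase", "tacan"}, "consequents": {"assert"}}

print(rule_distance(monitor_a, monitor_b))
print(rule_distance(monitor_a, monitor_a))
```

With a high antecedent weight, two monitoring rules that watch the same conditions stay close even when their actions differ, which matches the observation that antecedent-driven distances suit a monitoring system.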

Figure 3 shows the cohesion plot for a primary view of ONAV. The
cohesion values beyond 200 groups are not plotted because there are
too many single-rule groups after that point. Consider some of the
interesting plateau regions such as those around 11 and 50 groups.
Partitionings generated with the primary view are more or less in
accordance with the developer’s partitionings in the rulebase
reflecting various phase values. At 50 groups, we can see various
subaspects for the tacan subphase, such as tacan prediction rules,
rules that put tacan in automatic mode, rules to determine lru
quality, and so on, grouped in separate groups. However, at 10 groups,
all these



[Figure 3: Cohesiveness Plot: ONAV rulebase - Primary View]

[Figure 4: Cohesiveness Plot: ONAV rulebase - Secondary View]

Group no 20:
Total number of rules in group: 15
Distance:: Min: 2.000000 Max: 7.666667 Mean: 4.284770
Cohesiveness: 0.429254 Minimum Membership: 0.033520

130  init-engaged-system-is-bfs  0.150933
131  init-engaged-system-is-pass  0.445672
134  init-system-availability-bfs-only  0.537089
135  init-system-availability-pass-only  0.566508
137  init-system-availability-both-pass-avail  0.561815
136  init-system-availability-both  0.565853
140  init-report-major-mode  0.451083
138  init-wrong-atmosphere  0.399324
139  init-right-atmosphere  0.371124
132  init-enable-msbls-sensor-lights  0.232097
133  init-enable-tacan-sensor-lights  0.295337
141  init-keep-last-ops-num  0.362514
142  init-report-abort-mode  0.501832
143  init-report-ascent-events  0.544941
223  nav-initialize  0.452685

Figure 5: Initialization Rules - Primary Clustering


tacan rules come together to form one group 
as conceived by the developer. Thus, while 
the original partitioning of ONAV into 16 
files by the developer provided a coarse par- 
titioning based on the different phase as- 
pects of the knowledge-based system, there 
is added value in using the MVP-CA tool 
to expose the substructures within these ab- 
stract groups. 

In the primary view, some groupings seem 
to have been generated based on criteria 
other than phase control. Initialization 
rules across different files come together in 
a group, group 20 in Figure 5, revealing ini- 
tialization relationships from various phases. 
Initializations from other files, such as nav-initialize
from file state.r, combine with this
group revealing initialization relationships 
across files. This is an important revelation 
from the point of view of maintenance and 
verification. 


In order to reveal a secondary view, we excised the concepts of phase
and engaged-system, which had the highest dispersion values in the
primary view. The cohesion plot for the secondary view is given in
Figure 4. Figures 7 and 8 give cross-sections of secondary groupings
when all phase values were excised. The rule labelings generated in
these files are the rulenames given by the developer originally. The
numbers on the left are the rule numbers; distance between rule
numbers thus gives an indication of the degree of juxtaposition of the
rules in the combined rule base. Right-hand side numbers provide the
cohesion value of the rule with respect to its group.
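
The group listings in the figures can be read mechanically. The sketch below parses member lines of the form "rule number, rulename, cohesion value" using the Group 12 values from Figure 6; the regular expression and the `parse_members` helper are assumptions about the layout, not part of the MVP-CA tool. For this group, the mean of the member values agrees with the reported group cohesiveness of 1.112825 to six decimals.

```python
import re

# One member line: rule number, rulename, cohesion value.
MEMBER_LINE = re.compile(r"^\s*(\d+)\s+(\S+)\s+(\d+\.\d+)\s*$")


def parse_members(listing):
    """Return (rule_number, rulename, cohesion) tuples; header lines are skipped."""
    members = []
    for line in listing.splitlines():
        match = MEMBER_LINE.match(line)
        if match:
            members.append((int(match.group(1)), match.group(2), float(match.group(3))))
    return members


# Group 12 of the primary view, as listed in Figure 6.
group_12 = """
Total number of rules in group: 4
42 hstd-bad 1.229437
44 hstd-same 0.884921
43 hstd-good 1.136364
45 hstd-unavail 1.200577
"""

members = parse_members(group_12)
mean_cohesion = sum(value for _, _, value in members) / len(members)
# Spread of rule numbers: a small spread means the rules sit close together
# in the combined rulebase.
spread = max(number for number, _, _ in members) - min(number for number, _, _ in members)
print(len(members), round(mean_cohesion, 6), spread)
```

A small spread, as here, indicates rules that were already adjacent in one file; the secondary-view groups are interesting precisely because their rule numbers are far apart.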

Once the phase aspect is deleted from the rulebase, other
domain-dependent concepts start asserting themselves. In fact, in
Figure 7, group 8, rules with similar rulenames (hstd-same, hstd-bad,
hstd-good and hstd-unavail) across different files (hstd.r and
operator.r) come together because all of these rules deal with an
incorrect input value for the hstd indicator. However, the hstd
indicator is important in two subphases (fact-assertion and hstd).
Once the phase component is deleted, the domain information that
determines the hstd status pulls these rules into the same group. In
the primary view these rules were in separate groups, 6 and 12, as
shown in Figure 6.

Group no 6:
Total number of rules in group: 19
Distance:: Min: 2.000000 Max: 6.000000 Mean: 3.688889
Cohesiveness: 0.443354 Minimum Membership: 0.160000

27  control-kickoff  0.385737
194  operator-stop  0.526758
201  operator-uplink-runway  0.454210
195  operator-delta-state  0.472484
196  operator-changed-delta-state  0.520917
197  operator-bfs-no-go  0.398486
198  operator-bfs-go  0.438764
199  operator-runway-selection  0.400035
200  operator-desired-runway-from-operator  0.443254
204  operator-atmosphere-change  0.375280
202  operator-toggle-tacan  0.342885
203  operator-cant-toggle  0.416190
205  gndeph-bad  0.443719
207  gndeph-same  0.490814
206  gndeph-good  0.452024
209  hstd-good  0.451576
208  hstd-bad  0.481044
210  hstd-same  0.534733
211  hstd-unavail  0.394810

Group no 12:
Total number of rules in group: 4
Distance:: Min: 2.333333 Max: 3.250000 Mean: 2.763889
Cohesiveness: 1.112825 Minimum Membership: 0.571429

42  hstd-bad  1.229437
44  hstd-same  0.884921
43  hstd-good  1.136364
45  hstd-unavail  1.200577

Figure 6: Hstd rules - Primary View

Group no 8:
Total number of rules in group: 24
Distance:: Min: 2.000000 Max: 9.900000 Mean: 4.437921
Cohesiveness: 0.377193 Minimum Membership: 0.000000

27  control-kickoff  0.351796
211  hstd-unavail  0.353633
194  operator-stop  0.479850
201  operator-uplink-runway  0.414068
210  hstd-same  0.492559
195  operator-delta-state  0.459732
196  operator-changed-delta-state  0.487993
197  operator-bfs-no-go  0.375855
198  operator-bfs-go  0.405531
199  operator-runway-selection  0.355490
200  operator-desired-runway-from-operator  0.391028
205  gndeph-bad  0.440268
207  gndeph-same  0.446688
202  operator-toggle-tacan  0.318074
203  operator-cant-toggle  0.389968
43  hstd-good  0.294328
206  gndeph-good  0.444678
209  hstd-good  0.472972
42  hstd-bad  0.316276
208  hstd-bad  0.487546
44  hstd-same  0.220726
138  init-wrong-atmosphere  0.090802
139  init-right-atmosphere  0.148827
204  operator-atmosphere-change  0.413935

Figure 7: Hstd rules - Secondary View

Group no 5:
Total number of rules in group: 4
Distance:: Min: 2.000000 Max: 4.000000 Mean: 3.000000
Cohesiveness: 1.328788 Minimum Membership: 0.013423

20  baro-aif-changed  1.176493
36  drag-aif-changed  1.653500
310  tacan-aif-changed  1.372859
179  msbls-aif-changed  1.112300

Figure 8: Aif rules - Secondary View

It is also interesting to note that rules that 
share the concept of modifying the auto- 
inhibit-force flag (aif) in different phases all 
combine together in group 5, see Figure 8. 
This is a functional grouping of rules based 
on actions to be taken when there is a dis- 
crepancy between the previous and current 
values of the aif flag in the barometer, drag, 
tacan and msbls units. An orthogonal view 
of the rulebase comes into perspective with 
this grouping. 

Such a view may be of immense value to 
the maintainer of the rulebase, since func- 
tional dependencies like these can be ex- 
tremely difficult to locate across files, es- 
pecially if the maintainer has not been the 
original developer of the system. Thus, our
experimental results with the MVP-CA tool
have demonstrated the feasibility of discovering
significant structures within the rulebase
by providing a mechanism to structure both 
hierarchically (from detail to abstract) and 
orthogonally (from different perspectives). 

Related Work 

Extraction of meta-knowledge for the pur- 
poses of comprehending and maintaining ex- 
pert systems has been an accepted norm. In 
this section, we examine the role of structur- 
ing for this purpose in some well-established 
knowledge-based systems. 


Systems such as XCON [4, 18], which have
been in development for more than 10 years,
required the development of a new rule-based
language, RIME, and a rewrite as XCON-in-RIME
to facilitate maintenance. XCON-in-RIME
is supposed to make the domain knowledge 
more explicit both in terms of restructuring 
the rules and in terms of exposing the con- 
trol structure for firing of the rules. Thus the 
problem space gets more hierarchically or- 
ganized into different functional aspects, the 
problem solving method is made more ex- 
plicit, a domain-specific classification is im- 
posed on the rules and rule templates are 
created to serve as guides for rule creation. 

Meta-Dendral [6] is a case study in the 
area of acquisition of domain knowledge. 
Meta-Dendral tries to resolve the bottleneck 
of knowledge acquisition through automatic 
generation of rule sets so as to aid the pro- 
cess of formation of newer scientific theories 
in mass spectroscopy. 

TEIRESIAS [7] is built upon the MYCIN 
system to provide a mechanism for effective 
knowledge transfer. TEIRESIAS uses meta- 
rules to encode rule-based strategies that 
govern the usage of other rules. For this pur- 
pose it generates a set of rule models that are 
then used to guide this effort by being sug- 
gestive of both the content and form of the 
rules. These rule models can suggest incom- 
plete areas of the knowledge base, provide 
summary explanations and help during
debugging sessions. TEIRESIAS demonstrates
the power of analyzing rule sets for experts,
especially when writing new rules: it is very
helpful to see existing rules that are similar
to a new rule under consideration so as to
set the appropriate certainty factors in the
new rule. Similarity may lie in the premises
or in the conclusions. By
comparing other evidence and other conclu- 
sions, the strength of the proposed rule can 
be estimated in the proper context. In fact, 
each of the clusterings carries an extra slot
indicating the context in which the rule set 
applies. 
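
The idea of retrieving existing rules that resemble a proposed rule can be illustrated with a minimal sketch. This is not the TEIRESIAS mechanism; it assumes that each rule is reduced to a set of premise terms and a set of conclusion terms, and it scores similarity with Jaccard overlap, a measure chosen here purely for illustration. The rule names and terms are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    premises: frozenset
    conclusions: frozenset


def jaccard(a, b):
    """Overlap of two sets, |a & b| / |a | b|; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_rules(new_rule, rulebase, threshold=0.3):
    """Return (score, rule) pairs, best first.

    A rule is scored by the larger of its premise overlap and its
    conclusion overlap with the new rule (an illustrative choice).
    """
    scored = []
    for rule in rulebase:
        score = max(jaccard(new_rule.premises, rule.premises),
                    jaccard(new_rule.conclusions, rule.conclusions))
        if score >= threshold:
            scored.append((score, rule))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical rulebase; the terms are invented for this example.
rulebase = [
    Rule("r1", frozenset({"sensor-a-high", "sensor-b-high"}), frozenset({"fault-x"})),
    Rule("r2", frozenset({"valve-open", "pressure-low"}), frozenset({"leak"})),
    Rule("r3", frozenset({"temp-high", "sensor-b-high"}), frozenset({"fault-x"})),
]
candidate = Rule(
    "new",
    frozenset({"sensor-a-high", "sensor-b-high", "sensor-c-high"}),
    frozenset({"fault-y"}),
)
matches = similar_rules(candidate, rulebase)
```

An expert drafting the candidate rule could then inspect the certainty factors of the retrieved rules before choosing one for the new rule.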

Although others [11, 13, 14] have at- 
tempted to cluster knowledge bases in or- 
der to abstract and structure the knowledge 
in them, existing approaches are limited in 
two major ways. First, we believe that no
single structuring viewpoint is sufficient
to comprehend a complex knowledge base. 
Second, it is difficult to understand a sin- 
gle knowledge base isolated from an under- 
standing of the underlying application do- 
main. Often clues to the underlying seman- 
tic concepts are provided through descriptive 
names. Even then, the syntactic structure 
alone is rarely sufficient for managing and 
maintaining a complex system. 

Clustering analysis can be used to reveal 
regularities in the knowledge base which can 
suggest possible subdomains of the problem. 
This structuring of the knowledge base is in- 
tended to capture both the explicit and the 
implicit knowledge in the knowledge base. 
The point of interest of such an analysis 
should not be the clusters themselves, but the 
principles and ideas suggested by the clus- 
ters. Such groups would allow one to ab- 
stract away from the point of view that each 
rule is a procedure call and look at the sys- 
tem from higher semantic levels. Each such 
group or unit can then be viewed as a proce- 
dure having a well-defined interface to other 
rule-groups. Once a rule base is decomposed 
into such “firewalled” units, studying the in- 
teractions between rules would become more 
tractable. 
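
The notion of a rule group with a well-defined interface can be made concrete with a small sketch. It assumes that a partition of the rules is already available and that each rule is described only by the facts it tests and the facts it concludes; the function name, rule names and fact names are hypothetical and are not taken from the MVP-CA tool or from the Shuttle knowledge base.

```python
def group_interfaces(groups, rules):
    """Compute the facts each rule group imports from and exports to others.

    groups: dict mapping group name -> list of rule names
    rules:  dict mapping rule name -> (premise facts, conclusion facts)
    """
    consumed = {g: set().union(*(rules[r][0] for r in members))
                for g, members in groups.items()}
    produced = {g: set().union(*(rules[r][1] for r in members))
                for g, members in groups.items()}
    interfaces = {}
    for g in groups:
        others = [o for o in groups if o != g]
        produced_elsewhere = set().union(*(produced[o] for o in others))
        consumed_elsewhere = set().union(*(consumed[o] for o in others))
        interfaces[g] = {
            # facts tested in this group but concluded by another group
            "imports": consumed[g] & produced_elsewhere,
            # facts concluded in this group and tested by another group
            "exports": produced[g] & consumed_elsewhere,
        }
    return interfaces


# Hypothetical three-rule rulebase split into two groups.
rules = {
    "r1": ({"raw_altitude"}, {"altitude_valid"}),
    "r2": ({"altitude_valid", "raw_velocity"}, {"energy_state"}),
    "r3": ({"energy_state"}, {"guidance_mode"}),
}
groups = {"sensing": ["r1"], "guidance": ["r2", "r3"]}
interfaces = group_interfaces(groups, rules)
```

A small import and export set suggests that a group is well isolated; a large one indicates that the partition cuts across strongly interacting rules.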

Due to the declarative style of program- 
ming in knowledge-based systems, the gen- 
eration of clusters to capture significant con- 
cepts in the domain seems more feasible than 
it would be for procedural software. By 
using knowledge-based programming techniques,
one is much closer to the domain
knowledge of the problem than with pro- 
cedural languages. The control aspects of 
the problem are abstracted away into the 
inference engine (or, alternatively, the control
rules are explicitly declared). Generation
of a model of the problem domain can
be accomplished through clustering. The ex- 
istence of a model of the domain would bene- 
fit the analysis of other knowledge-based sys- 
tems within that domain by providing seeds 
for cluster formation. In addition, the use 
of a domain model to assist in the develop- 
ment of new knowledge-based systems is a 
promising research direction. 
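
One way a domain model might provide seeds for cluster formation is sketched below. The sketch assumes that the domain model is reduced to a set of characteristic terms per concept and that each rule is reduced to the set of terms it mentions; each rule is then attached to the concept with which it shares the most terms. The concept names, rule names and terms are hypothetical.

```python
def assign_to_seeds(rules, seeds):
    """Attach each rule to the seed concept sharing the most terms with it.

    rules: dict mapping rule name -> set of terms used by the rule
    seeds: dict mapping concept name -> set of terms from the domain model
    Rules sharing no term with any seed are collected under "unassigned"
    (this sketch assumes no concept carries that name).
    """
    clusters = {concept: [] for concept in seeds}
    clusters["unassigned"] = []
    for name, terms in rules.items():
        best = max(seeds, key=lambda concept: len(terms & seeds[concept]))
        if terms & seeds[best]:
            clusters[best].append(name)
        else:
            clusters["unassigned"].append(name)
    return clusters


# Hypothetical domain model and rule vocabulary.
seeds = {
    "navigation": {"position", "velocity"},
    "sensors": {"imu", "tacan"},
}
rules = {
    "r1": {"imu", "drift"},
    "r2": {"velocity", "position", "update"},
    "r3": {"display"},
}
clusters = assign_to_seeds(rules, seeds)
```

Rules left unassigned point either to concepts missing from the domain model or to knowledge specific to the system under analysis.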

Conclusions 

Knowledge-based systems have the poten- 
tial to greatly increase the capabilities of 
many aerospace applications such as Space 
Station, manned and unmanned spacecraft 
and civilian and military air transport. Au- 
tomated systems that are knowledge based 
need to be deployed aboard these missions 
to reduce manpower support. Failure of 
such systems, however, can result in loss of 
life and of substantial financial investment. 
Hence, these systems need to be highly
reliable. Whereas DOD standards for
conventional software, such as Ada 9X, have
been developed, a credible development and
validation methodology for knowledge-based
systems is currently lacking. Acceptance of
knowledge-based systems software for crit- 
ical missions is very much dependent on de- 
velopment of effective software engineering 
and validation techniques. A structured ap- 
proach to management and maintenance of 
such systems would go a long way towards 
dispelling the myth that expert systems are 
inherently unreliable and that nothing can 
be done about it. 

Expert systems have wide commercial
applicability. Liability issues arising out of
improper functioning of such systems de- 
mand that any risk to life or property be ei- 
ther totally eliminated or at least minimized. 
Hence, it is imperative to develop rigorous 
and automatic testing tools for the verification
and validation of knowledge-based systems.
An integrated environment for expert
system verification and validation, such as is
proposed by MVP-CA, would help overcome
the reliability barrier, opening expert systems
up to a broad range of important applications.
An integrated
system for performing V&V on structured 
knowledge bases will enhance the reliability 
of knowledge-based software and bridge its 
current gap with conventional systems. 